Overview

This data is RNA-seq data from monocytes and CD4 T cells from healthy control patients, patients with relapsing-remitting MS (RRMS) and patients with secondary progessive MS (SPMS). The monocytes and CD4 T cells are taken from the same sample. Each sample was run twice as technical replicates. The affected were age matched with the controls, so there are 3 patients that should be analyzed together if possible in each group. They would like to look at all pairwise comparisons with the controls, RRMS and SPMS samples within each cell type. They are not interested in looking across the different cell types.

The data is RNA-seq 3’DGE data– it appears as if it is SCRB-seq data to me or something similar to that. The table of counts have been provided, we have the total counts and the UMI deduped counts; we’ll be using the deduped counts provided by the Broad.

This work is for the Gandhi lab, and Maria Mazzola is the person leading the project.

Data cleaning

First we’ll read in the count data generated by the Broad and convert the column names to the well numbers:

> library(readr)
> library(dplyr)
> library(tidyr)
> library(ggplot2)
> library(cowplot)
> library(viridis)
> set.seed(127187, kind = NULL, normal.kind = NULL)
> counts = read_tsv("../data/SSF-1505_plate_GTAGAGGA.unq.refseq.umi.dat") %>% 
+     as.data.frame()
> rownames(counts) = counts[, 1]
> counts[, 1] = NULL
> cnames = unlist(lapply(strsplit(colnames(counts), split = "_"), tail, n = 1))
> colnames(counts) = cnames
> head(counts)
         A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 B1 B2 B3 B4 B5 B6 B7 B8 B9
A1BG      9 13  1  0  4  6  4  5  6  10   3   5 12  7  0  2 21 10  6  4 10
A1BG-AS1  2  0  1  1  1  1  0  0  0   0   0   0  1  1  0  0  3  2  0  0  0
A1CF      0  0  0  1  0  0  0  0  0   0   0   0  0  0  0  0  0  0  0  0  0
A2M       3  0  4  3  1  0  3  3  0   0   8  11  0  0  0  2  1  1  2  0  0
A2M-AS1   0  2  5  5  2  0  2  2  0   0   4   5  2  0  2  2  0  1  1  0  1
A2ML1     5  5  0  3  3  4  1  2  4   3   3   2  6  3  3  3  5  2  3  1  3
         B10 B11 B12 C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 D1 D2 D3 D4 D5
A1BG      11   4   3  8  4  5  6  5  3  8  3  9   9   1   4  4  6  2  3  6
A1BG-AS1   0   0   0  2  1  0  2  0  1  1  2  0   1   1   0  0  1  0  1  0
A1CF       0   0   0  1  0  0  0  0  0  0  0  0   0   0   0  0  0  0  0  0
A2M        1   4   3  0  0  0  1  1  0  1  0  1   0   5   1  0  0  1  4  0
A2M-AS1    1   4   1  0  1  0  1  3  4  3  0  0   0   1   3  2  3  2  0  0
A2ML1      3   2   2  2  0  2  4  9  5  1  2  4   2   6   3  2  4  0  2  7
         D6 D7 D8 D9 D10 D11 D12 E1 E2 E3 E4 E5 E6 E7 E8 E9 E10 E11 E12 F1
A1BG      6  3  3  2   4   5   4  1  3  4  6  8  6  4  6  8   5   3   4  4
A1BG-AS1  1  0  0  0   0   1   0  0  2  0  0  0  1  2  0  0   0   4   0  0
A1CF      0  1  0  0   0   0   0  0  0  0  0  0  0  0  0  0   0   0   0  0
A2M       0  2  7  0   2   2   1  0  0  3  0  0  1  1  4  0   1   5   4  0
A2M-AS1   1  2  4  0   2   3   2  0  2  3  3  3  2  3  2  0   0   1   7  0
A2ML1     2  2  1  4   0   4   0  0  6  4  1  3  2  1  3  3   1   1   2  3
         F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 G1 G2 G3 G4 G5 G6 G7 G8 G9
A1BG     10  5  2  3  8  5  4  9   8   2   0  3  6  2  2  3  4  2  2  4
A1BG-AS1  0  0  2  1  1  0  0  0   1   0   0  0  0  0  0  0  0  0  0  0
A1CF      0  0  0  0  0  0  0  0   0   0   0  0  0  0  0  0  0  0  0  0
A2M       0  1  1  0  0  0  3  0   0   5   4  1  0  1  2  0  0  4  7  0
A2M-AS1   1  2  3  1  1  3  2  0   2   1   2  2  0  1  2  0  2  3  2  0
A2ML1     4  2  3  4  1  1  2  1   1   1   3  0  4  1  1  5 10  1  2  4
         G10 G11 G12 H1 H2 H3 H4 H5 H6 H7 H8 H9 H10 H11 H12
A1BG       0   1   1 10  5  5  3  6  3  0  6  3   4   4   3
A1BG-AS1   0   0   0  0  1  0  0  1  0  0  2  1   3   1   0
A1CF       0   0   0  0  0  0  0  0  0  0  0  0   0   0   0
A2M        1   1   0  0  0  0  1  0  1  1  4  0   0   2   1
A2M-AS1    1   0   1  0  1  3  0  0  0  0  0  0   0   1   1
A2ML1      5   1   1  4  2  1  0  1  2  1  1  4   2   0   3

We’ll also read in the CSV of metadata and place some of the information into more specific columns. We added columns for phenotype, group, celltype and sample to make it easier to work with the samples later.

> metadata = read_csv("../metadata/sample-info.csv") %>% tidyr::separate(samplename, 
+     c("sample", "celltype"), sep = "_")
> metadata$phenotype = substr(metadata$sample, 1, 2)
> metadata$group = as.factor(substr(metadata$sample, 3, 3))
> metadata$treated = ifelse(metadata$past_treat == "none", "untreated", "treated")
> head(metadata)
Source: local data frame [6 x 16]

   cell sample celltype   rin   age gender    sampled  processed
  <chr>  <chr>    <chr> <dbl> <int>  <chr>      <chr>      <chr>
1    A1    HC1     mono   9.4    35      F 12.02.2014 01.27.2015
2    A2    HC1     mono   9.4    35      F 12.02.2014 01.27.2015
3    A3    HC1   Tcells   8.9    35      F 12.02.2014 01.27.2015
4    A4    HC1   Tcells   8.9    35      F 12.02.2014 01.27.2015
5    A5    RR1     mono   9.3    36      F 05.13.2013 02.13.2015
6    A6    RR1     mono   9.3    36      F 05.13.2013 02.13.2015
Variables not shown: viable_pct <dbl>, edss <dbl>, duration <int>,
  past_treat <chr>, cur_treat <chr>, phenotype <chr>, group <fctr>,
  treated <chr>.

Sanity checking

Now we’ll sanity check the metadata. Samples in the sample group should be all the same gender and should be similar in age. We’ll plot those to make sure I didn’t screw anything up.

> ggplot(metadata, aes(group, age, color = gender, size = edss)) + geom_jitter() + 
+     theme(text = element_text(family = "Gill Sans")) + scale_color_viridis(discrete = TRUE)

Looks good.

The count data has the correct number of columns 96 and a reasonable number of rows 23895. There are no missing values for the counts or anything like that so that data looks good to go as well.

Count based quality analysis

Here we calculate a bunch of sumary statistics at the level of the cell.

> cellstats = data.frame(cell = colnames(counts), total_counts = colSums(counts), 
+     genes_detected = colSums(counts > 0), robust_genes_detected = colSums(counts > 
+         10)) %>% left_join(metadata, by = "cell")

Total counts

Overall the total counts per cell has a pretty even distribution, hovering around 0.5 - 1 million counts per cell. Groups 7 and 8 seem to have a consistently lower counts per cell than the rest of the samples. Usually I colored the bars by rin number, age, gender and viable_pct (not shown) but none of those seemed to correspond to the lower number of counts.

> ggplot(cellstats, aes(cell, total_counts)) + geom_bar(stat = "identity") + facet_wrap(~group, 
+     scales = "free_x") + theme(axis.text.x = element_text(angle = 90, hjust = 1, 
+     size = 8), text = element_text(family = "Gill Sans")) + xlab("") + ylab("total counts")

Genes detected

The plots of genes detected looks pretty good, a nice even amount for all of the samples. The number of genes detected is high, around half of the queried genes per cell, which is good. Groups 7 and 8 tend to have a slightly lower number of genes detected than the other cells, which we expected due to having a lower number of counts.

> ggplot(cellstats, aes(cell, genes_detected)) + geom_bar(stat = "identity") + 
+     facet_wrap(~group, scales = "free_x") + theme(axis.text.x = element_text(angle = 90, 
+     hjust = 1, size = 8), text = element_text(family = "Gill Sans")) + xlab("") + 
+     ylab("genes detected")

Library complexity

Here we plot genes detected vs the total number of counts. We see a clear relationship between the number of genes detected and the total number of counts, suggesting that we could detect even more genes if we sequenced these samples more deeply. This plot is mainly to detect if there are groups of failed cells that don’t seem to follow along the polynomial line we fit.

> ggplot(cellstats, aes(total_counts, genes_detected)) + geom_point() + geom_smooth(method = "lm", 
+     formula = y ~ poly(x, 2), size = 1) + theme(text = element_text(family = "Gill Sans")) + 
+     xlab("total counts") + ylab("genes detected")

PCA

Here we run principal component analysis (PCA) on the count data. Before performing PCA we center and scale the data and correct for the mean-variance relationship in RNA-seq data by performing a variance stabilizing transformation on the data. This ensures that low expressed genes with high variance do not dominate the PCA plot.

After variance stabilizing the data we take the top 500 most variable genes and use them to perform PCA.

> library(DESeq2)
> design = ~group + celltype + phenotype + phenotype:celltype
> dds = DESeqDataSetFromMatrix(countData = counts, colData = metadata, design = design)
> vst = varianceStabilizingTransformation(dds)
> pca_loadings = function(object, ntop = 500) {
+     rv <- matrixStats::rowVars(assay(object))
+     select <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]
+     pca <- prcomp(t(assay(object)[select, ]))
+     percentVar <- pca$sdev^2/sum(pca$sdev^2)
+     names(percentVar) = colnames(pca$x)
+     pca$percentVar = percentVar
+     return(pca)
+ }
> pc = pca_loadings(vst)
> comps = data.frame(pc$x)
> comps$Name = rownames(comps)
> library(dplyr)
> comps = comps %>% left_join(metadata, by = c(Name = "cell"))
> pca_plot = function(comps, nc1, nc2, colorby) {
+     library(cowplot)
+     library(viridis)
+     c1str = paste0("PC", nc1)
+     c2str = paste0("PC", nc2)
+     ggplot(comps, aes_string(c1str, c2str, color = colorby)) + geom_point() + 
+         xlab(paste0(c1str, ": ", round(pc$percentVar[nc1] * 100), "% variance")) + 
+         ylab(paste0(c2str, ": ", round(pc$percentVar[nc2] * 100), "% variance"))
+ }

PC1 vs. PC2

We can see a great separation by cell type along PC1:

> pca_plot(comps, 1, 2, "celltype")

PC2 vs. PC3

We see plotting PC2 vs PC3 a more subtle difference between phenotypes. RR and MS look mixed but there seems to be a separation between the control and the affected samples.

> pca_plot(comps, 2, 3, "phenotype")

PCA, cell type separation

Part of the reason for the poor higher component separation is we are using the 500 genes that are dominated by cell type. If we first separate out the cell types and then do the PCA, we might see a better separation based on phenotype.

> monometa = subset(metadata, celltype == "mono")
> monocounts = counts[, monometa$cell]
> tcellmeta = subset(metadata, celltype == "Tcells")
> tcellcounts = counts[, tcellmeta$cell]
> design = ~group + phenotype
> monodds = DESeqDataSetFromMatrix(countData = monocounts, colData = monometa, 
+     design = design)
> tcelldds = DESeqDataSetFromMatrix(countData = tcellcounts, colData = tcellmeta, 
+     design = design)
> monovst = varianceStabilizingTransformation(monodds)
> tcellvst = varianceStabilizingTransformation(tcelldds)
> monopc = pca_loadings(monovst)
> tcellpc = pca_loadings(tcellvst)
> monocomps = data.frame(monopc$x)
> tcellcomps = data.frame(tcellpc$x)
> monocomps$cell = rownames(monocomps)
> tcellcomps$cell = rownames(tcellcomps)
> monocomps = monocomps %>% left_join(metadata, by = c(cell = "cell"))
> tcellcomps = tcellcomps %>% left_join(metadata, by = c(cell = "cell"))

T-cell

PC1-PC2

We can see a separation along component 2 with which gender the samples are from:

> pca_plot(tcellcomps, 1, 2, "gender")

> pca_plot(tcellcomps, 1, 2, "phenotype")

> pca_plot(tcellcomps, 1, 2, "cur_treat")

> pca_plot(tcellcomps, 1, 2, "past_treat")

> pca_plot(tcellcomps, 1, 2, "duration") + scale_color_viridis()

But component 1 is a little bit of a mystery, it does kind of separate by phenotype, all of the HC samples are clustered on the left and the right is more mixed MS/RR.

PC3-PC4

> pca_plot(tcellcomps, 3, 4, "gender")

> pca_plot(tcellcomps, 3, 4, "phenotype")

> pca_plot(tcellcomps, 3, 4, "cur_treat")

> pca_plot(tcellcomps, 3, 4, "past_treat")

> pca_plot(tcellcomps, 3, 4, "duration") + scale_color_viridis()

Monocyte

We don’t see the same gender based separation in the monocytes that we saw with the Tcells.

PC1-PC2

There isn’t a separation based on gender in PC1-PC2, it isn’t clear what is separating these cells because they also don’t separate by phenotype. They don’t seem to separate very well on current treatment or by

> pca_plot(monocomps, 1, 2, "gender")

> pca_plot(monocomps, 1, 2, "phenotype")

> pca_plot(monocomps, 1, 2, "cur_treat")

> pca_plot(monocomps, 1, 2, "past_treat")

> pca_plot(monocomps, 1, 2, "duration") + scale_color_viridis()

PC3-PC4

Samples do separate along PC3 and PC4 by gender. So for T-cells there seems to be a larger gender effect than the monocytes.

> pca_plot(monocomps, 3, 4, "gender")

> pca_plot(monocomps, 3, 4, "phenotype")

> pca_plot(monocomps, 3, 4, "cur_treat")

> pca_plot(monocomps, 3, 4, "past_treat")

> pca_plot(monocomps, 3, 4, "duration") + scale_color_viridis()

Filtering and combining

The cells all look good so we don’t need to do any filtering of the individual cells. We’ll drop all genes that don’t have any counts in any cells, and then combine the counts for all of the technical replicate cells by adding them together before doing differential expression.

> monomelt = monocounts %>% add_rownames() %>% tidyr::gather(cell, count, -rowname) %>% 
+     left_join(monometa, by = "cell") %>% dplyr::select(rowname, sample, count) %>% 
+     group_by(rowname, sample) %>% summarise(total = sum(count)) %>% tidyr::spread(sample, 
+     total) %>% as.data.frame()
> rownames(monomelt) = monomelt$rowname
> monomelt$rowname = NULL
> monomelt = monomelt[rowSums(monomelt) > 0, ]
> monometa = monometa %>% group_by(sample) %>% filter(row_number() == 1) %>% dplyr::select(-cell)
> monometa = as.data.frame(monometa)
> rownames(monometa) = monometa$sample
> monomelt = monomelt[, rownames(monometa)]
> 
> tcellmelt = tcellcounts %>% add_rownames() %>% tidyr::gather(cell, count, -rowname) %>% 
+     left_join(tcellmeta, by = "cell") %>% dplyr::select(rowname, sample, count) %>% 
+     group_by(rowname, sample) %>% summarise(total = sum(count)) %>% tidyr::spread(sample, 
+     total) %>% as.data.frame()
> rownames(tcellmelt) = tcellmelt$rowname
> tcellmelt$rowname = NULL
> tcellmelt = tcellmelt[rowSums(tcellmelt) > 0, ]
> tcellmeta = tcellmeta %>% group_by(sample) %>% filter(row_number() == 1) %>% 
+     dplyr::select(-cell)
> tcellmeta = as.data.frame(tcellmeta)
> rownames(tcellmeta) = tcellmeta$sample
> tcellmelt = tcellmelt[, rownames(tcellmeta)]

Differential expression

We will fit the model to the two different celltypes separately. We fit a model of the form: ~group+phenotype where we compare only the cells from the same group to each other and look for differences in phenotype.

> plotMA_full = function(res) {
+     ymax = max(res$log2FoldChange, na.rm = TRUE)
+     ymin = min(res$log2FoldChange, na.rm = TRUE)
+     plotMA(res, ylim = c(ymin, ymax))
+ }

Monocytes

The dispersion plot looks awesome, in this plot the black dots are the estimates of the dispersion if you just considered each gene by itself, and the red line is the best fit of the dispersion for genes of a given expression value. The blue dots are the final value for the dispersions, which for outlier spots are pulled back towards the fitted line.

> design = ~group + phenotype
> monodds = DESeqDataSetFromMatrix(countData = monomelt, colData = monometa, design = design)
> monodds = estimateSizeFactors(monodds)
> monodds = DESeq(monodds)
> plotDispEsts(monodds)

RR vs HC

> monorr_vs_hc = results(monodds, contrast = c("phenotype", "RR", "HC"))
> plotMA_full(monorr_vs_hc)

> monorr_vs_hc = monorr_vs_hc[order(monorr_vs_hc$padj), ]
> monorr_vs_hc = data.frame(cbind(data.frame(gene = rownames(monorr_vs_hc)), monorr_vs_hc))
> monorr_vs_hc_sig = subset(monorr_vs_hc, padj < 0.1)
> write.table(monorr_vs_hc, file = "mono-rr-vs-hc.csv", row.names = FALSE, col.names = TRUE, 
+     quote = FALSE, sep = ",")

We flag 26 genes different between RR and HC samples in the monocytes, using a FDR cutoff of 0.1. Of those 10 are up in the RR samples and 16 are down in the RR samples.

SP vs HC

> monosp_vs_hc = results(monodds, contrast = c("phenotype", "SP", "HC"))
> plotMA_full(monosp_vs_hc)

> monosp_vs_hc = monosp_vs_hc[order(monosp_vs_hc$padj), ]
> monosp_vs_hc = data.frame(cbind(data.frame(gene = rownames(monosp_vs_hc)), monosp_vs_hc))
> monosp_vs_hc_sig = subset(monosp_vs_hc, padj < 0.1)
> write.table(monosp_vs_hc, file = "mono-sp-vs-hc.csv", row.names = FALSE, col.names = TRUE, 
+     quote = FALSE, sep = ",")

We flag 76 genes different between SP and HC samples in the monocytes, using a FDR cutoff of 0.1. Of those 35 are up in the SP samples and 41 are down in the SP samples.

SP vs RR

> monosp_vs_rr = results(monodds, contrast = c("phenotype", "SP", "RR"))
> plotMA_full(monosp_vs_rr)

> monosp_vs_rr = monosp_vs_rr[order(monosp_vs_rr$padj), ]
> monosp_vs_rr = data.frame(cbind(data.frame(gene = rownames(monosp_vs_rr)), monosp_vs_rr))
> monosp_vs_rr_sig = subset(monosp_vs_rr, padj < 0.1)
> write.table(monosp_vs_rr, file = "mono-sp-vs-rr.csv", row.names = FALSE, col.names = TRUE, 
+     quote = FALSE, sep = ",")

We flag 1 genes different between SP and RR samples in the monocytes, using a FDR cutoff of 0.1. Of those 1 are up in the SP samples and 0 are down in the SP samples.

T-cells

> design = ~group + phenotype
> tcelldds = DESeqDataSetFromMatrix(countData = tcellmelt, colData = tcellmeta, 
+     design = design)
> tcelldds = estimateSizeFactors(tcelldds)
> tcelldds = DESeq(tcelldds)
> plotDispEsts(tcelldds)

RR vs HC

> tcellrr_vs_hc = results(tcelldds, contrast = c("phenotype", "RR", "HC"))
> plotMA_full(tcellrr_vs_hc)

> tcellrr_vs_hc = tcellrr_vs_hc[order(tcellrr_vs_hc$padj), ]
> tcellrr_vs_hc = data.frame(cbind(data.frame(gene = rownames(tcellrr_vs_hc)), 
+     tcellrr_vs_hc))
> tcellrr_vs_hc_sig = subset(tcellrr_vs_hc, padj < 0.1)
> write.table(tcellrr_vs_hc, file = "tcell-rr-vs-hc.csv", row.names = FALSE, col.names = TRUE, 
+     quote = FALSE, sep = ",")

We flag 1 genes different between RR and HC samples in the T-cells, using a FDR cutoff of 0.1. Of those 0 are up in the RR samples and 1 are down in the RR samples.

SP vs HC

> tcellsp_vs_hc = results(tcelldds, contrast = c("phenotype", "SP", "HC"))
> plotMA_full(tcellsp_vs_hc)

> tcellsp_vs_hc = tcellsp_vs_hc[order(tcellsp_vs_hc$padj), ]
> tcellsp_vs_hc = data.frame(cbind(data.frame(gene = rownames(tcellsp_vs_hc)), 
+     tcellsp_vs_hc))
> tcellsp_vs_hc_sig = subset(tcellsp_vs_hc, padj < 0.1)
> write.table(tcellsp_vs_hc, file = "tcell-sp-vs-hc.csv", row.names = FALSE, col.names = TRUE, 
+     quote = FALSE, sep = ",")

We flag 75 genes different between sp and HC samples in the T-cells, using a FDR cutoff of 0.1. Of those 37 are up in the SP samples and 38 are down in the SP samples.

SP vs RR

> tcellsp_vs_rr = results(tcelldds, contrast = c("phenotype", "SP", "RR"))
> plotMA_full(tcellsp_vs_rr)

> tcellsp_vs_rr = tcellsp_vs_rr[order(tcellsp_vs_rr$padj), ]
> tcellsp_vs_rr = data.frame(cbind(data.frame(gene = rownames(tcellsp_vs_rr)), 
+     tcellsp_vs_rr))
> tcellsp_vs_rr_sig = subset(tcellsp_vs_rr, padj < 0.1)
> write.table(tcellsp_vs_rr, file = "tcell-sp-vs-rr.csv", row.names = FALSE, col.names = TRUE, 
+     quote = FALSE, sep = ",")

We flag 0 genes different between sp and RR samples in the T-cells, using a FDR cutoff of 0.1. Of those 0 are up in the SP samples and 0 are down in the SP samples.

MS vs control

MS vs control - monocytes

> monometa$affected = ifelse(monometa$phenotype == "HC", "unaffected", "affected")
> design = ~group + affected
> monoaffecteddds = DESeqDataSetFromMatrix(countData = monomelt, colData = monometa, 
+     design = design)
> monoaffecteddds = estimateSizeFactors(monoaffecteddds)
> monoaffecteddds = DESeq(monoaffecteddds)
> monoaffected = results(monoaffecteddds, contrast = c("affected", "affected", 
+     "unaffected"))
> plotMA_full(monoaffected)

> monoaffected = monoaffected[order(monoaffected$padj), ]
> monoaffected = data.frame(cbind(data.frame(gene = rownames(monoaffected)), monoaffected))
> monoaffected_sig = subset(monoaffected, padj < 0.1)
> write.table(monoaffected, file = "monoaffected.csv", row.names = FALSE, col.names = TRUE, 
+     quote = FALSE, sep = ",")

We flag 155 genes different between affected and unaffected monocytes, using a FDR cutoff of 0.1. Of those 67 are up in the affected samples and 88 are down in the affected samples.

MS vs control - Tcells

> tcellmeta$affected = ifelse(tcellmeta$phenotype == "HC", "unaffected", "affected")
> design = ~group + affected
> tcellaffecteddds = DESeqDataSetFromMatrix(countData = tcellmelt, colData = tcellmeta, 
+     design = design)
> tcellaffecteddds = estimateSizeFactors(tcellaffecteddds)
> tcellaffecteddds = DESeq(tcellaffecteddds)
> tcellaffected = results(tcellaffecteddds, contrast = c("affected", "affected", 
+     "unaffected"))
> plotMA_full(tcellaffected)

> tcellaffected = tcellaffected[order(tcellaffected$padj), ]
> tcellaffected = data.frame(cbind(data.frame(gene = rownames(tcellaffected)), 
+     tcellaffected))
> tcellaffected_sig = subset(tcellaffected, padj < 0.1)
> write.table(tcellaffected, file = "tcellaffected.csv", row.names = FALSE, col.names = TRUE, 
+     quote = FALSE, sep = ",")

We flag 61 genes different between affected and unaffected tcells, using a FDR cutoff of 0.1. Of those 29 are up in the affected samples and 32 are down in the affected samples.

Pathway analysis

> biomart_dataset = "hsapiens_gene_ensembl"
> keggname = "hsa"
> orgdb = "org.Hs.eg.db"
> library(clusterProfiler)
> library(DOSE)
> library(orgdb, character.only = TRUE)
> library(biomaRt)
> mart = biomaRt::useMart(biomart = "ensembl", dataset = biomart_dataset)
> entrez = biomaRt::getBM(attributes = c("entrezgene", "hgnc_symbol"), mart = mart)
> entrez$entrezgene = as.character(entrez$entrezgene)
> entrez$gene = entrez$hgnc_symbol
> gsea_cp = function(res, comparison) {
+     res = data.frame(res)
+     res$ensembl_gene_id = rownames(res)
+     res = res %>% left_join(entrez, by = "gene") %>% filter(!is.na(entrezgene))
+     lfc = data.frame(res)[, "log2FoldChange"]
+     lfcse = data.frame(res)[, "lfcSE"]
+     genes = lfc/lfcse
+     names(genes) = res$entrezgene
+     genes = genes[order(genes, decreasing = TRUE)]
+     genes = genes[order(genes, decreasing = TRUE)]
+     cc = summary(gseGO(genes, ont = "CC", OrgDb = orgdb, nPerm = 500, minGSSize = 10, 
+         pvalueCutoff = 1, pAdjustMethod = "BH", verbose = TRUE))
+     cc$ont = "CC"
+     mf = summary(gseGO(genes, ont = "MF", OrgDb = orgdb, nPerm = 500, minGSSize = 10, 
+         pvalueCutoff = 1, pAdjustMethod = "BH", verbose = TRUE))
+     mf$ont = "MF"
+     bp = summary(gseGO(genes, ont = "bp", OrgDb = orgdb, nPerm = 500, minGSSize = 10, 
+         pvalueCutoff = 1, pAdjustMethod = "BH", verbose = TRUE))
+     bp$ont = "BP"
+     kg = summary(gseKEGG(geneList = genes, minGSSize = 10, organism = keggname, 
+         nPerm = 500, pvalueCutoff = 1, verbose = TRUE))
+     kg$ont = "KG"
+     combined = rbind(cc, mf, bp, kg)
+     if (orgdb == "org.Hs.eg.db") {
+         do = summary(gseDO(geneList = genes, nPerm = 500, minGSSize = 10, pvalueCutoff = 1, 
+             pAdjustMethod = "BH", verbose = TRUE))
+         do$ont = "DO"
+         combined = rbind(combined, do)
+     }
+     combined$comparison = comparison
+     return(combined)
+ }
> writeres = function(res, filename) {
+     write.table(res, file = filename, quote = FALSE, sep = ",", row.names = FALSE, 
+         col.names = TRUE)
+ }

Monocytes

RR vs HC

> monorr_vs_hc_gsea = gsea_cp(monorr_vs_hc, "mono-rr-vs-hc")
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
> monorr_vs_hc_gsea_sig = subset(monorr_vs_hc_gsea, qvalues < 0.05)
> writeres(monorr_vs_hc_gsea_sig, "mono-rr-vs-hc-gsea.csv")
> sigplot = monorr_vs_hc_gsea_sig %>% group_by(ont) %>% summarise(nsig = n())
> ggplot(sigplot, aes(ont, nsig)) + geom_bar(stat = "identity") + xlab("Gene set type") + 
+     ylab("Number of sets significant")

Monocytes, RR vs HC, GSEA

SP vs HC

> monosp_vs_hc_gsea = gsea_cp(monosp_vs_hc, "mono-sp-vs-hc")
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
> monosp_vs_hc_gsea_sig = subset(monosp_vs_hc_gsea, qvalues < 0.05)
> writeres(monosp_vs_hc_gsea_sig, "mono-sp-vs-hc-gsea.csv")
> sigplot = monosp_vs_hc_gsea_sig %>% group_by(ont) %>% summarise(nsig = n())
> ggplot(sigplot, aes(ont, nsig)) + geom_bar(stat = "identity") + xlab("Gene set type") + 
+     ylab("Number of sets significant")

Monocytes, SP vs HC, GSEA

SP vs RR

> monosp_vs_rr_gsea = gsea_cp(monosp_vs_rr, "mono-sp-vs-rr")
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
> monosp_vs_rr_gsea_sig = subset(monosp_vs_rr_gsea, qvalues < 0.05)
> writeres(monosp_vs_rr_gsea_sig, "mono-sp-vs-rr-gsea.csv")
> sigplot = monosp_vs_rr_gsea_sig %>% group_by(ont) %>% summarise(nsig = n())
> ggplot(sigplot, aes(ont, nsig)) + geom_bar(stat = "identity") + xlab("Gene set type") + 
+     ylab("Number of sets significant")

Monocytes, SP vs RR, GSEA

T-cells

RR vs HC

> tcellrr_vs_hc_gsea = gsea_cp(tcellrr_vs_hc, "tc-rr-vs-hc")
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
> tcellrr_vs_hc_gsea_sig = subset(tcellrr_vs_hc_gsea, qvalues < 0.05)
> writeres(tcellrr_vs_hc_gsea_sig, "tcell-rr-vs-hc-gsea.csv")
> sigplot = tcellrr_vs_hc_gsea_sig %>% group_by(ont) %>% summarise(nsig = n())
> ggplot(sigplot, aes(ont, nsig)) + geom_bar(stat = "identity") + xlab("Gene set type") + 
+     ylab("Number of sets significant")

T-cells, RR vs HC, GSEA

SP vs HC

> tcellsp_vs_hc_gsea = gsea_cp(tcellsp_vs_hc, "tcell-sp-vs-hc")
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
> tcellsp_vs_hc_gsea_sig = subset(tcellsp_vs_hc_gsea, qvalues < 0.05)
> writeres(tcellsp_vs_hc_gsea_sig, "tcell-sp-vs-hc-gsea.csv")
> sigplot = tcellsp_vs_hc_gsea_sig %>% group_by(ont) %>% summarise(nsig = n())
> ggplot(sigplot, aes(ont, nsig)) + geom_bar(stat = "identity") + xlab("Gene set type") + 
+     ylab("Number of sets significant")

T-cells, SP vs HC, GSEA

SP vs RR

> tcellsp_vs_rr_gsea = gsea_cp(tcellsp_vs_rr, "tcell-sp-vs-rr")
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
> tcellsp_vs_rr_gsea_sig = subset(tcellsp_vs_rr_gsea, qvalues < 0.05)
> writeres(tcellsp_vs_rr_gsea_sig, "tcell-sp-vs-rr-gsea.csv")
> sigplot = tcellsp_vs_rr_gsea_sig %>% group_by(ont) %>% summarise(nsig = n())
> ggplot(sigplot, aes(ont, nsig)) + geom_bar(stat = "identity") + xlab("Gene set type") + 
+     ylab("Number of sets significant")

T-cells, SP vs RR, GSEA

MS vs Control, Monocytes

> monoaffected_gsea = gsea_cp(monoaffected, "mono-ms-vs-control-gsea")
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
> monoaffected_gsea_sig = subset(monoaffected_gsea, qvalues < 0.05)
> writeres(monoaffected_gsea_sig, "mono-ms-vs-control-gsea.csv")
> sigplot = monoaffected_gsea_sig %>% group_by(ont) %>% summarise(nsig = n())
> ggplot(sigplot, aes(ont, nsig)) + geom_bar(stat = "identity") + xlab("Gene set type") + 
+     ylab("Number of sets significant")

Monocytes, MS vs affected GSEA

MS vs Control, Tcells

> tcellaffected_gsea = gsea_cp(tcellaffected, "tcell-ms-vs-control-gsea")
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
[1] "calculating observed enrichment scores..."
[1] "calculating permutation scores..."

  |                                                                       
  |                                                                 |   0%
[1] "calculating p values..."
[1] "done..."
> tcellaffected_gsea_sig = subset(tcellaffected_gsea, qvalues < 0.05)
> writeres(tcellaffected_gsea_sig, "tcell-ms-vs-control-gsea.csv")
> sigplot = tcellaffected_gsea_sig %>% group_by(ont) %>% summarise(nsig = n())
> ggplot(sigplot, aes(ont, nsig)) + geom_bar(stat = "identity") + xlab("Gene set type") + 
+     ylab("Number of sets significant")

Tcells, MS vs affected GSEA

Pathway overlap

These look at the overlap between the two cell types in pathways/ontologies/disease terms identified as differentially regulated by GSEA.

> overlap_gsea = function(sig1, sig2) {
+     overlap = intersect(sig1$ID, sig2$ID)
+     pathways = subset(sig1, ID %in% overlap)[, c("ID", "Description", "ont", 
+         "comparison")]
+     return(pathways)
+ }

RR vs HC

> rr_vs_hc_gsea_overlap = overlap_gsea(monorr_vs_hc_gsea_sig, tcellrr_vs_hc_gsea_sig)
> knitr::kable(rr_vs_hc_gsea_overlap)
ID Description ont comparison
hsa01200 hsa01200 Carbon metabolism KG mono-rr-vs-hc
hsa04146 hsa04146 Peroxisome KG mono-rr-vs-hc
hsa03050 hsa03050 Proteasome KG mono-rr-vs-hc
hsa05134 hsa05134 Legionellosis KG mono-rr-vs-hc
hsa00020 hsa00020 Citrate cycle (TCA cycle) KG mono-rr-vs-hc
> write.table(rr_vs_hc_gsea_overlap, file = "rr-vs-hc-gsea-overlap.csv", sep = ",", 
+     row.names = FALSE, col.names = TRUE, quote = FALSE)

RR vs HC GSEA overlap

SP vs HC

> sp_vs_hc_gsea_overlap = overlap_gsea(monosp_vs_hc_gsea_sig, tcellsp_vs_hc_gsea_sig)
> knitr::kable(sp_vs_hc_gsea_overlap)
ID Description ont comparison
GO:0005911 GO:0005911 cell-cell junction CC mono-sp-vs-hc
GO:0000313 GO:0000313 organellar ribosome CC mono-sp-vs-hc
GO:0005761 GO:0005761 mitochondrial ribosome CC mono-sp-vs-hc
GO:0000314 GO:0000314 organellar small ribosomal subunit CC mono-sp-vs-hc
GO:0005763 GO:0005763 mitochondrial small ribosomal subunit CC mono-sp-vs-hc
GO:0044437 GO:0044437 vacuolar part CC mono-sp-vs-hc
GO:0005840 GO:0005840 ribosome CC mono-sp-vs-hc
GO:0030880 GO:0030880 RNA polymerase complex CC mono-sp-vs-hc
GO:0000323 GO:0000323 lytic vacuole CC mono-sp-vs-hc
GO:0005764 GO:0005764 lysosome CC mono-sp-vs-hc
GO:0005759 GO:0005759 mitochondrial matrix CC mono-sp-vs-hc
hsa04510 hsa04510 Focal adhesion KG mono-sp-vs-hc
hsa00190 hsa00190 Oxidative phosphorylation KG mono-sp-vs-hc
hsa04810 hsa04810 Regulation of actin cytoskeleton KG mono-sp-vs-hc
hsa05205 hsa05205 Proteoglycans in cancer KG mono-sp-vs-hc
hsa04020 hsa04020 Calcium signaling pathway KG mono-sp-vs-hc
hsa05206 hsa05206 MicroRNAs in cancer KG mono-sp-vs-hc
hsa05012 hsa05012 Parkinson’s disease KG mono-sp-vs-hc
hsa04514 hsa04514 Cell adhesion molecules (CAMs) KG mono-sp-vs-hc
hsa04010 hsa04010 MAPK signaling pathway KG mono-sp-vs-hc
hsa05322 hsa05322 Systemic lupus erythematosus KG mono-sp-vs-hc
hsa04024 hsa04024 cAMP signaling pathway KG mono-sp-vs-hc
hsa04917 hsa04917 Prolactin signaling pathway KG mono-sp-vs-hc
hsa05016 hsa05016 Huntington’s disease KG mono-sp-vs-hc
hsa05152 hsa05152 Tuberculosis KG mono-sp-vs-hc
hsa04720 hsa04720 Long-term potentiation KG mono-sp-vs-hc
hsa05160 hsa05160 Hepatitis C KG mono-sp-vs-hc
hsa04725 hsa04725 Cholinergic synapse KG mono-sp-vs-hc
hsa04022 hsa04022 cGMP-PKG signaling pathway KG mono-sp-vs-hc
hsa04911 hsa04911 Insulin secretion KG mono-sp-vs-hc
hsa04722 hsa04722 Neurotrophin signaling pathway KG mono-sp-vs-hc
hsa04918 hsa04918 Thyroid hormone synthesis KG mono-sp-vs-hc
hsa04972 hsa04972 Pancreatic secretion KG mono-sp-vs-hc
hsa03008 hsa03008 Ribosome biogenesis in eukaryotes KG mono-sp-vs-hc
hsa05416 hsa05416 Viral myocarditis KG mono-sp-vs-hc
hsa04970 hsa04970 Salivary secretion KG mono-sp-vs-hc
hsa03013 hsa03013 RNA transport KG mono-sp-vs-hc
hsa04145 hsa04145 Phagosome KG mono-sp-vs-hc
hsa05168 hsa05168 Herpes simplex infection KG mono-sp-vs-hc
hsa04142 hsa04142 Lysosome KG mono-sp-vs-hc
hsa04750 hsa04750 Inflammatory mediator regulation of TRP channels KG mono-sp-vs-hc
hsa05140 hsa05140 Leishmaniasis KG mono-sp-vs-hc
hsa05032 hsa05032 Morphine addiction KG mono-sp-vs-hc
hsa04912 hsa04912 GnRH signaling pathway KG mono-sp-vs-hc
hsa05132 hsa05132 Salmonella infection KG mono-sp-vs-hc
hsa04146 hsa04146 Peroxisome KG mono-sp-vs-hc
hsa04971 hsa04971 Gastric acid secretion KG mono-sp-vs-hc
hsa00280 hsa00280 Valine, leucine and isoleucine degradation KG mono-sp-vs-hc
hsa00480 hsa00480 Glutathione metabolism KG mono-sp-vs-hc
hsa04630 hsa04630 Jak-STAT signaling pathway KG mono-sp-vs-hc
hsa04910 hsa04910 Insulin signaling pathway KG mono-sp-vs-hc
hsa05221 hsa05221 Acute myeloid leukemia KG mono-sp-vs-hc
hsa05200 hsa05200 Pathways in cancer KG mono-sp-vs-hc
hsa04660 hsa04660 T cell receptor signaling pathway KG mono-sp-vs-hc
hsa05150 hsa05150 Staphylococcus aureus infection KG mono-sp-vs-hc
hsa04270 hsa04270 Vascular smooth muscle contraction KG mono-sp-vs-hc
hsa04612 hsa04612 Antigen processing and presentation KG mono-sp-vs-hc
hsa05144 hsa05144 Malaria KG mono-sp-vs-hc
hsa00071 hsa00071 Fatty acid degradation KG mono-sp-vs-hc
hsa04120 hsa04120 Ubiquitin mediated proteolysis KG mono-sp-vs-hc
> write.table(sp_vs_hc_gsea_overlap, file = "sp-vs-hc-gsea-overlap.csv", sep = ",", 
+     row.names = FALSE, col.names = TRUE, quote = FALSE)

SP vs HC GSEA overlap

SP vs RR

> sp_vs_rr_gsea_overlap = overlap_gsea(monosp_vs_rr_gsea_sig, tcellsp_vs_rr_gsea_sig)
> knitr::kable(sp_vs_rr_gsea_overlap)

ID Description ont comparison — ———— —- ———–

> write.table(sp_vs_rr_gsea_overlap, file = "sp-vs-rr-gsea-overlap.csv", sep = ",", 
+     row.names = FALSE, col.names = TRUE, quote = FALSE)

SP vs RR GSEA overlap

MS vs Control

> ms_gsea_overlap = overlap_gsea(monoaffected_gsea_sig, tcellaffected_gsea_sig)
> knitr::kable(ms_gsea_overlap)
ID Description ont comparison
GO:0000313 GO:0000313 organellar ribosome CC mono-ms-vs-control-gsea
GO:0005761 GO:0005761 mitochondrial ribosome CC mono-ms-vs-control-gsea
GO:0005759 GO:0005759 mitochondrial matrix CC mono-ms-vs-control-gsea
GO:0000314 GO:0000314 organellar small ribosomal subunit CC mono-ms-vs-control-gsea
GO:0005763 GO:0005763 mitochondrial small ribosomal subunit CC mono-ms-vs-control-gsea
hsa04380 hsa04380 Osteoclast differentiation KG mono-ms-vs-control-gsea
hsa04912 hsa04912 GnRH signaling pathway KG mono-ms-vs-control-gsea
hsa04919 hsa04919 Thyroid hormone signaling pathway KG mono-ms-vs-control-gsea
hsa05206 hsa05206 MicroRNAs in cancer KG mono-ms-vs-control-gsea
hsa05152 hsa05152 Tuberculosis KG mono-ms-vs-control-gsea
hsa05416 hsa05416 Viral myocarditis KG mono-ms-vs-control-gsea
hsa04146 hsa04146 Peroxisome KG mono-ms-vs-control-gsea
hsa05322 hsa05322 Systemic lupus erythematosus KG mono-ms-vs-control-gsea
hsa00280 hsa00280 Valine, leucine and isoleucine degradation KG mono-ms-vs-control-gsea
hsa05012 hsa05012 Parkinson’s disease KG mono-ms-vs-control-gsea
hsa05140 hsa05140 Leishmaniasis KG mono-ms-vs-control-gsea
hsa00480 hsa00480 Glutathione metabolism KG mono-ms-vs-control-gsea
hsa04911 hsa04911 Insulin secretion KG mono-ms-vs-control-gsea
hsa05010 hsa05010 Alzheimer’s disease KG mono-ms-vs-control-gsea
hsa04971 hsa04971 Gastric acid secretion KG mono-ms-vs-control-gsea
hsa01200 hsa01200 Carbon metabolism KG mono-ms-vs-control-gsea
hsa05016 hsa05016 Huntington’s disease KG mono-ms-vs-control-gsea
hsa05164 hsa05164 Influenza A KG mono-ms-vs-control-gsea
DOID:2468 DOID:2468 psychotic disorder DO mono-ms-vs-control-gsea
DOID:1579 DOID:1579 respiratory system disease DO mono-ms-vs-control-gsea
DOID:10652 DOID:10652 Alzheimer’s disease DO mono-ms-vs-control-gsea
DOID:680 DOID:680 tauopathy DO mono-ms-vs-control-gsea
DOID:0050161 DOID:0050161 lower respiratory tract disease DO mono-ms-vs-control-gsea
DOID:114 DOID:114 heart disease DO mono-ms-vs-control-gsea
DOID:7148 DOID:7148 rheumatoid arthritis DO mono-ms-vs-control-gsea
DOID:1936 DOID:1936 atherosclerosis DO mono-ms-vs-control-gsea
DOID:2348 DOID:2348 arteriosclerotic cardiovascular disease DO mono-ms-vs-control-gsea
DOID:2320 DOID:2320 obstructive lung disease DO mono-ms-vs-control-gsea
DOID:2349 DOID:2349 arteriosclerosis DO mono-ms-vs-control-gsea
DOID:850 DOID:850 lung disease DO mono-ms-vs-control-gsea
DOID:10763 DOID:10763 hypertension DO mono-ms-vs-control-gsea
DOID:5844 DOID:5844 myocardial infarction DO mono-ms-vs-control-gsea
DOID:1724 DOID:1724 duodenal ulcer DO mono-ms-vs-control-gsea
DOID:3083 DOID:3083 chronic obstructive pulmonary disease DO mono-ms-vs-control-gsea
DOID:26 DOID:26 pancreas disease DO mono-ms-vs-control-gsea
DOID:3213 DOID:3213 demyelinating disease DO mono-ms-vs-control-gsea
DOID:4989 DOID:4989 pancreatitis DO mono-ms-vs-control-gsea
DOID:3393 DOID:3393 coronary artery disease DO mono-ms-vs-control-gsea
DOID:3905 DOID:3905 lung carcinoma DO mono-ms-vs-control-gsea
DOID:3908 DOID:3908 non-small cell lung carcinoma DO mono-ms-vs-control-gsea
> write.table(ms_gsea_overlap, file = "ms-gsea-overlap.csv", sep = ",", row.names = FALSE, 
+     col.names = TRUE, quote = FALSE)

MS vs control GSEA overlap

Summary

Normalized counts

> nmonocounts = counts(monodds, normalized = TRUE)
> nmonocounts = cbind(data.frame(gene = rownames(nmonocounts)), nmonocounts)
> write.table(nmonocounts, file = "monocounts_normalized.csv", row.names = FALSE, 
+     col.names = TRUE, sep = ",", quote = FALSE)
> ntcellcounts = counts(tcelldds, normalized = TRUE)
> ntcellcounts = cbind(data.frame(gene = rownames(ntcellcounts)), ntcellcounts)
> write.table(ntcellcounts, file = "tcellcounts_normalized.csv", row.names = FALSE, 
+     col.names = TRUE, sep = ",", quote = FALSE)

Normalized counts for monocytes

Normalized counts for T cells